Multi-Task and Multi-Modal Learning for RGB Dynamic Gesture Recognition
Abstract
Gesture recognition is becoming increasingly popular due to its many potential applications in human-machine interaction. Existing multi-modal gesture recognition systems take multi-modal data as input to improve accuracy, but such methods require additional modality sensors, which greatly limits their application scenarios. We therefore propose an end-to-end multi-task learning framework for training 2D convolutional neural networks. The framework can use the depth modality to improve accuracy during training, and saves costs by using only the RGB modality during inference. Our framework is trained to learn a shared representation for two tasks: gesture segmentation and gesture recognition. The depth modality contains prior information about the location of the gesture, so it can be used as the supervision signal for gesture segmentation. A plug-and-play module named Multi-Scale-Decoder (MSD) is designed to realize segmentation; it contains two sub-decoders that operate at a lower stage and a higher stage of the network, respectively, helping the network attend to key target areas, ignore irrelevant information, and extract discriminative features. Additionally, the MSD module and the depth modality are used only at training time; at inference, only the RGB modality and the network without the MSD are required. Experimental results on three public datasets show that our proposed method provides superior performance compared with existing frameworks. Moreover, other CNN-based frameworks also obtain notable improvements from our approach.
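The train/inference split described above can be sketched in a few lines. This is a minimal illustrative stand-in, not the authors' code: the encoder is a toy linear map rather than a 2D CNN, and all weight names and loss weights are assumptions. It shows the core idea that a depth-supervised segmentation loss is added during training, while prediction uses only the RGB recognition branch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the shared 2D-CNN encoder and the two task heads.
# Shapes are arbitrary: 64-dim flattened frame, 16-dim feature, 10 classes.
W_enc = rng.standard_normal((64, 16))
W_cls = rng.standard_normal((16, 10))   # recognition head (gesture classes)
W_seg = rng.standard_normal((16, 64))   # MSD-like segmentation head (per-pixel mask)

def encode(rgb):
    """Shared representation from the RGB input (rgb: shape (64,))."""
    return np.tanh(rgb @ W_enc)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def train_loss(rgb, label, depth_mask, seg_weight=0.5):
    """Training-time multi-task objective: cross-entropy for recognition
    plus a segmentation term supervised by a mask derived from depth.
    The seg_weight of 0.5 is an illustrative choice."""
    feat = encode(rgb)
    p_cls = softmax(feat @ W_cls)
    rec = -np.log(p_cls[label] + 1e-9)
    p_seg = 1.0 / (1.0 + np.exp(-(feat @ W_seg)))      # sigmoid per pixel
    seg = -np.mean(depth_mask * np.log(p_seg + 1e-9)
                   + (1 - depth_mask) * np.log(1 - p_seg + 1e-9))
    return rec + seg_weight * seg

def predict(rgb):
    """Inference: the segmentation head and depth input are discarded;
    only the RGB recognition branch runs."""
    return int(np.argmax(encode(rgb) @ W_cls))
```

The key design point mirrored here is that the auxiliary segmentation branch shapes the shared features during training but adds zero cost at inference, since `predict` never touches `W_seg` or the depth mask.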
Similar resources
Challenges in Multi-modal Gesture Recognition
This paper surveys the state of the art on multimodal gesture recognition and introduces the JMLR special topic on gesture recognition 2011-2015. We began right at the start of the Kinect™ revolution, when inexpensive infrared cameras providing image depth recordings became available. We published papers using this technology and other more conventional methods, including regular video cameras,...
Multi-modal Unsupervised Feature Learning for RGB-D Scene Labeling
Most of the existing approaches for RGB-D indoor scene labeling employ hand-crafted features for each modality independently and combine them in a heuristic manner. There has been some attempt on directly learning features from raw RGB-D data, but the performance is not satisfactory. In this paper, we adapt the unsupervised feature learning technique for RGB-D labeling as a multi-modality learn...
Multi-Modal Multi-Task Deep Learning for Autonomous Driving
Several deep learning approaches have been applied to the autonomous driving task, many employing end-to-end deep neural networks. Autonomous driving is complex, utilizing multiple behavioral modalities ranging from lane changing to turning and stopping. However, most existing approaches do not factor the different behavioral modalities of the driving task into the training strategy. This pap...
Multi-modal Multi-task Learning for Automatic Dietary Assessment
We investigate the task of automatic dietary assessment: given meal images and descriptions uploaded by real users, our task is to automatically rate the meals and deliver advisory comments for improving users’ diets. To address this practical yet challenging problem, which is multi-modal and multi-task in nature, an end-to-end neural model is proposed. In particular, comprehensive meal represe...
Bayesian Co-Boosting for Multi-modal Gesture Recognition
With the development of data acquisition equipment, more and more modalities become available for gesture recognition. However, there still exist two critical issues for multimodal gesture recognition: how to select discriminative features for recognition and how to fuse features from different modalities. In this paper, we propose a novel Bayesian Co-Boosting framework for multi-modal gesture ...
Journal
Journal title: IEEE Sensors Journal
Year: 2021
ISSN: 1558-1748, 1530-437X
DOI: https://doi.org/10.1109/jsen.2021.3123443